Method and system for data integrity protection

ABSTRACT

A method of authenticating a message (111) received via a transmission channel (108) using a Message Authentication Code (MAC). The message comprises a message body (114) and a tag (116), and the method comprises the steps of generating a second tag (115) according to a MAC function (112) on the basis of the received message body and a secret key (113), calculating a distance (117) between the received tag and the generated second tag, and comparing (118) the calculated distance with a predetermined threshold value.

[0001] This invention relates to a method and system of authenticating a message received via a transmission channel.

[0002] Data integrity and authenticity are fundamental expectations in any secure data communications system, and they comprise an assurance that information has not been modified by someone who is not authorized to do so.

[0003] Data integrity may be provided by Message Authentication Codes (MACs). MACs are used for the integrity protection of data communications payload, since they provide a computationally efficient way of protecting large amounts of data.

[0004] Examples of applications where MACs are used are streaming data applications. Streaming refers to a technique for transferring data such that it can be processed as a steady and continuous stream. Streaming technologies are becoming increasingly important with the growth of the Internet, because most users do not have fast enough access to download large multimedia files quickly. With streaming, the client browser or plug-in can start processing the data before the entire file has been transmitted, e.g. for the display of pictures, animations or videos or the playing of audio presentations. Such multimedia services are also part of the emerging third generation mobile telecommunications services.

[0005] MACs are based on a symmetric shared secret between the sender and the receiver. The secret value is called the key. The secret key is an input variable to the MAC calculation. Only somebody who possesses the correct secret key is able to calculate the MAC value for an arbitrary message. A MAC value is an integrity check value that is calculated from and appended to the original message data. Upon receiving a message protected by a MAC, the receiver calculates a MAC check value on the basis of the received data. If the MAC check value is equal to the received integrity check value, the message is accepted as authentic. Examples of known MACs include the so-called Keyed-Hashing for Message Authentication (HMAC) algorithm which is based on cryptographic one-way hash functions such as the secure hash algorithm SHA-1 and the message-digest algorithm MD5. MACs are used to provide data integrity protection in many data communication protocols. Examples of protocols supporting MACs include the IETF TLS, the SSH and the IPSec protocols.

[0006] It is a disadvantage of the above prior art methods for data integrity protection that they result in a low throughput in cases where data is carried over a channel with bit errors, e.g. a wireless channel or other noisy channels.

[0007] This is a particular disadvantage in situations where real time data or a large amount of data should be transmitted at a high speed over a wireless connection.

[0008] It is a further disadvantage of the above prior art methods that the communications performance is sensitive to the bit error rate of the communications channel.

[0009] The above and other problems are solved by a method of authenticating data, the method comprising the steps of

[0010] receiving a message and a corresponding first data item generated according to a first predetermined rule;

[0011] generating a second data item according to a second predetermined rule on the basis of the received message; calculating a first distance between the received first data item and the generated second data item;

[0012] comparing the calculated first distance with a predetermined distance value.

[0013] Consequently, the calculation of a distance between the data items is introduced, thereby allowing an acceptance of distorted data and, hence, increasing the throughput of the transmission channels while providing a high level of security.

[0014] Bit errors during the transmission of the message or the first data item which was calculated on the basis of the original message may cause the calculated second data item to differ from the received first data item. However, according to the invention, if the calculated second data item is sufficiently close to the first data item, the bit errors do not cause the message to be rejected at the receiver even though the calculated second data item differs from the first data item. Therefore, the number of re-transmissions due to bit errors is small. Consequently, it is an advantage of the invention that it increases the throughput of a transmission channel with bit errors.

[0015] For an intruder, on the other hand, it is hard to generate a forged message which will result in a second data item closer to the first data item than the predetermined threshold distance. Consequently, it is an advantage of the invention that it provides a high level of security. Preferably, the first and second predetermined rules are secret, i.e. only known to the transmitter and the receiver, respectively, or they are based upon a secret key value, such that the result of the rule is hard to predict without knowledge of the secret key. The use of a secret key has the advantage that the security of the method is ensured and a flexible method is provided, as the predetermined rules may be publicly known and used with a plurality of different keys.

[0016] A message may include data packets, streaming data, multimedia data such as video, television broadcast, video on demand, videoconferencing, voice, audio, animations, or graphics data, or other types of data, preferably comprising data where few bit errors do not corrupt the quality or usefulness of the data significantly.

[0017] The first and second data items may represent numerical values, character strings, bit sequences or other suitable data formats. The first and second data items will also be called tags in the following. Preferably, the first and second data items are cryptographic digests or other suitable MAC values calculated by a MAC mechanism.

[0018] It is a further advantage of the invention that the predetermined distance value, and thus the tolerated distance between the second and the first data item, may be adjusted to the known or expected error rate of a transmission channel. Consequently, an adversary can only change a small number of bits of the order of the bit error rate of the transmission channel.

[0019] When the method further comprises the step of processing the message conditioned on a result of the step of comparing the calculated first distance with a predetermined distance value, the received message is only processed when the comparison yields a desired result. Otherwise, the message may be rejected or made subject to further authentication or verification procedures.

[0020] In a preferred embodiment of the invention the step of processing the message comprises the step of accepting the message if the calculated first distance is smaller than the predetermined distance value.

[0021] The message and the first data item may be received via a transmission channel, e.g. a transmission channel of a communications network, a broadcasting network, a synchronous communications system, an asynchronous communications system, a packet based communications system, or the like.

[0022] When the message and the first data item are received via a wireless communications channel, bit errors are likely to occur during transmission and the bit error rate may vary over time according to changes in the transmission quality. Hence, the method according to the invention is particularly advantageous in connection with wireless transmission channels.

[0023] The wireless communications may be radio-based communications, e.g. using Bluetooth™ (Bluetooth is a trademark owned by Telefonaktiebolaget LM Ericsson, Sweden), W-CDMA, GSM, CDMA-2000, TCP/IP, WAP or another suitable protocol. Alternatively, the wireless communications may be based on other electromagnetic radiation such as IR, on acoustic signals, or another wireless communications technology.

[0024] In contrast to the prior art methods mentioned above, few bit errors are tolerated by the method according to the invention without causing rejection and re-transmission of data packets.

[0025] In another preferred embodiment of the invention the step of generating the second data item comprises the step of applying a predetermined permutation to a third data item derived from the message. The third data item may be the received message or a data item which is a result of an initial processing of the received message. According to this embodiment a few bit changes in a binary representation of the message result in a few bit changes in the corresponding second data item, as the permutation only changes the order of bits and not the number of zeros and ones. Correspondingly, the number of bit errors is preserved by a permutation. This property is particularly advantageous in connection with a distance function that depends on the number of bit errors, such as the Hamming distance. Preferably, the permutation is a secret permutation, or it is based on a secret key value, thereby ensuring the security of the method.

[0026] When the step of generating the second data item comprises the step of combining a fourth data item derived from the message with a predetermined fifth data item, it is difficult for an intruder to forge a message by a simple bit operation.

[0027] In a preferred embodiment of the invention the step of combining a fourth data item derived from the message with a predetermined fifth data item comprises the step of inserting predetermined binary sequences at predetermined positions of the fourth data item. It is an advantage of this embodiment of the invention that an inversion of all bits of a message sequence by an intruder may easily be detected.

[0028] The distance may be calculated by any suitable distance calculation function, such as the Hamming distance for bit sequences, the difference between numerical values, etc. Preferably, the distance calculation function implements a distance measure, i.e. a measure of difference with certain mathematical properties of being homogeneous, subadditive and positive, thereby providing a distance calculation with properties which may be described mathematically. When the calculated distance is a Hamming distance, the distance depends only on the number of bit errors, it has well-known properties and may be efficiently calculated.

[0029] In a preferred embodiment of the invention the method further comprises the steps of

[0030] a) generating a first sequence of message sections from the message, each message section having a predetermined length;

[0031] b) modifying at least a first message section of the sequence of message sections;

[0032] c) applying at least a first permutation to at least the modified first message section of the sequence of message sections;

[0033] d) calculating at least one XOR sum of a result of at least the first permutation;

[0034] e) calculating a hash value from the calculated at least one XOR sum.

[0035] It is an advantage of this embodiment of the invention that it provides a low forgery probability and efficiently generates a small tag which may efficiently be transmitted with the message.

[0036] It is a further advantage of this embodiment of the invention that it ensures a reliable authentication which makes it difficult for an intruder to forge the message.

[0037] Here, the term hash value comprises a data item generated from an input sequence, e.g. a bit sequence, according to a predetermined rule. Preferably, the hash value is smaller than the input sequence and is generated such that it is unlikely that two different input sequences result in the same hash value. Preferably, the hash value is generated by a universal hash function or an almost universal hash function.

[0038] When the method further comprises the step of repeating the steps of applying at least one permutation and calculating an XOR sum of the at least one permutation a predetermined number of times, the security of the method according to the invention is particularly high.

[0039] In a preferred embodiment of the invention the method further comprises the step of repeating the steps a)-e) with different permutations. It is an advantage of the invention that it results in a small tag value and provides a high level of security.

[0040] When the method further comprises the step of encrypting the calculated hash value, additional security is provided.

[0041] In another preferred embodiment of the invention the step of calculating the first distance further comprises the steps of

[0042] dividing the first and second data items into corresponding first and second sets of data sections;

[0043] calculating a second distance between a first section of the first set of sections and a corresponding second section of the second set of sections; and

[0044] comparing the second distance with a predetermined threshold value.

[0045] It is an advantage of the method according to the invention that it is not sensitive to small bit error rates caused by random transmission errors.

[0046] When the first data item is further generated on the basis of a first secret key code, and the step of generating the second data item on the basis of the received message further comprises the step of generating the second data item on the basis of a second secret key code, a high level of forgery protection is achieved. Even with knowledge of the first and second predetermined rules, it is hard for an intruder without knowledge of the first and second secret key codes to generate a forged message. With no knowledge of the secret key code it is hard to generate a forged message which will result in a second data item closer to the first data item than the predetermined threshold distance. Hence, according to this embodiment, the digest calculation of the second data item is based on a cryptographic method, i.e. a method using a secret key as one of the inputs, and the distance between the cryptographic digests is compared with the threshold. Consequently, a high level of protection is provided in a single operation without the need for further authenticity verification steps.

[0047] The first and second key codes may be different key codes or the same key code, and the first and second predetermined rules may be different rules or algorithms or they may be the same rule. Preferably, the first and second key codes are a shared secret of the sender and the recipient of the message. Preferably, the first and second predetermined rules are a MAC mechanism.

[0048] The invention further relates to a method of transmitting a message from a transmitter to a receiver via a transmission channel, the method comprising the steps of, at the transmitter, generating a first data item according to a first predetermined rule on the basis of the message;

[0049] transmitting the message and the generated first data item from the transmitter to the receiver;

[0050] generating a second data item according to a second predetermined rule on the basis of the received message;

[0051] calculating a first distance between the received first data item and the generated second data item;

[0052] comparing the calculated first distance with a predetermined distance value.

[0053] In a preferred embodiment of the invention, the generated first data item has a size which is smaller than a size of the message.

[0054] It is an advantage of the invention that it is efficient and requires little overhead in the data transmission.

[0055] In a further preferred embodiment of the invention the step of generating the first data item comprises the step of calculating a hash function on the basis of a sixth data item derived from the message. Hence, the transmitted first data item is smaller than the message.

[0056] The invention further relates to a communications system comprising

[0057] first processing means adapted to calculate a first data item according to a first predetermined rule on the basis of a message;

[0058] a transmitter adapted to transmit the message and the generated first data item via a transmission channel;

[0059] a receiver adapted to receive the transmitted message and the transmitted first data item;

[0060] second processing means adapted to generate a second data item according to a second predetermined rule on the basis of the received message;

[0061] to calculate a first distance between the received first data item and the generated second data item; and

[0062] to compare the calculated first distance with a predetermined distance value.

[0063] The invention further relates to an apparatus comprising

[0064] a receiver adapted to receive a message and a corresponding first data item generated according to a first predetermined rule;

[0065] first processing means adapted to generate a second data item according to a second predetermined rule on the basis of the received message;

[0066] to calculate a first distance between the received first data item and the generated second data item; and

[0067] to compare the calculated first distance with a predetermined distance value.

[0068] The apparatus may be any electronic equipment or part of such electronic equipment, where the term electronic equipment includes computers, such as stationary and portable PCs, stationary and portable radio communications equipment. The term portable radio communications equipment includes mobile radio terminals such as mobile telephones, pagers, communicators, e.g. electronic organisers, smart phones, PDAs, or the like.

[0069] In a preferred embodiment of the invention the apparatus is a mobile radio terminal.

[0070] The invention further relates to a data signal embodied in a carrier wave for use in a method described above and in the following, the data signal comprising a message body and a first data item.

[0071] The invention further relates to a computer program comprising program code means for performing all the steps of the method described above and in the following when said program is run on a microprocessor.

[0072] The invention further relates to a computer program product comprising program code means stored on a computer readable medium for performing the method described above and in the following when said computer program product is run on a microprocessor.

[0073] As the advantages of the above aspects of the invention and their respective preferred embodiments correspond to advantages of the method of authenticating a message and its corresponding embodiments described above and in the following, these will not be described again.

[0074] The invention will be explained more fully below in connection with preferred embodiments and with reference to the drawings, in which:

[0075] FIG. 1 shows a flow diagram of a method according to a first embodiment of the invention;

[0076] FIG. 2 shows a schematic view of a mapping according to an embodiment of the invention;

[0077] FIGS. 3a-b illustrate the difference between examples of distance functions according to embodiments of the invention;

[0078] FIG. 4 shows a schematic view of a method according to a second embodiment of the invention;

[0079] FIG. 5 shows a block diagram of a communications system according to an embodiment of the invention; and

[0080] FIGS. 6a-c show examples of message formats according to embodiments of the invention.

[0081] The invention will be described in the context of messages represented as binary sequences. However, it is understood that a person skilled in the art will be able to carry out the invention with other message formats, e.g. by transforming the other message format into a binary sequence. Examples of other message formats include e.g. plain text, byte representations, hex-values, octal-values, MP3, MPEG, JPEG, TIFF, etc.

[0082] FIG. 1 shows a flow diagram of a method according to a first embodiment of the invention where integrity protection is provided to a message m 104 during the transmission of the message from a transmitting side 101 via a transmission channel 108 to a receiving side 110.

[0083] At the transmitting side 101 a MAC value z 105 is calculated using a MAC function 102. The MAC function takes the message m and a secret key k 103 as inputs. The MAC value z 105 is combined with the original message m 104 by a concatenation function 106 or a combining circuit. The resulting combined message 107 is sent to the receiving side 110 via a transmission channel 108. During the transmission, there is a risk of the message 107 being altered by an unknown process 109. The alterations may be caused by transmission errors, or they may be due to modifications of the message by, for example, an unauthorised intruder. At the receiving side 110, the received message m′ 114 and the received MAC value z′ 116 are extracted from the received combined message 111 by an extraction function 101 or an extraction circuit. On the basis of the received message m′ 114 and the secret key k 113, a MAC value z″ 115 is calculated using the MAC function 112. Preferably, the MAC function 112 implements the same algorithm as the MAC function 102 used at the transmitting side, and the secret key k 113 is the same key as the secret key 103 used for calculating the original MAC value z 105. According to the invention, a distance d(z′,z″) between the received MAC value z′ 116 and the calculated value z″ 115 is calculated by a distance calculation function 117 which is based on a distance function d(·,·). In a subsequent step 118 the calculated distance d is compared to a predetermined threshold t. If the distance d is larger than the threshold t, the message is rejected in step 119; otherwise the message is accepted in step 120.

[0084] When using a prior art MAC method, a single bit error in a data packet will result in an incorrect MAC value, and the receiver will not accept the packet. As a result, the data needs to be retransmitted, irrespective of whether the bit error results from a transmission error or a change caused by an intruder. Hence, MAC protection over a channel with a high bit error rate results in poor throughput. The MAC method according to the invention allows an adversary to change some information in a data stream. However, an adversary is not able to change more than a small amount of the information. When transmitting data with a lot of information, this is not beneficial for an adversary. On the other hand, when transmitting streaming data over a channel with errors, the method according to the invention considerably increases the throughput compared to prior art MACs.

[0085] FIG. 2 shows a schematic view of a mapping according to an embodiment of the invention. In general, a MAC is a function f 206 which is a mapping from a message space M 201 to a tag space Z. The function is parameterised by a key k 207 from a key space K (not shown), i.e. the exact mapping is determined by a second input parameter 207 to f, called the secret key k, such that for any k∈K, m∈M, ∃ f(m;k)=z, where z∈Z. For example, given a value of k, the message m 202 is mapped to the tag z=f(m;k) 209 and the message m2 203 is mapped to the tag z2=f(m2;k). Preferably, the cardinality of Z is less than the cardinality of M in order to keep the tag size and, consequently, the transmission overhead required for transmitting the tag small.

[0086] A disadvantage of the prior art MAC methods is that a message is rejected if the calculated tag value differs from the received tag value, irrespective of how much the received message differs from the original message.

[0087] For a MAC according to the invention, also called a streaming MAC in the following, a distance function d(·,·) is defined on the tag space Z 208. Furthermore, let D(·,·) be a distance function defined on the message space M 201, and let t₁ 204 and t₂ 212 be predetermined threshold values. Preferably, the function f is defined such that for an intruder with knowledge of m 202 but without information about the value of k 207 it is hard to find a message m′ 203 in M with a distance d₁=D(m,m′) 205 to the original message m 202, such that d₁>t₁, and with a distance d₂″=d(z,z″) 211 between the corresponding tags which is smaller than t₂ 212. Furthermore, the function f is preferably defined such that it is hard for an intruder with knowledge of m 202 but without information about the value of k 207 to predict a value z′ 213 in Z such that the distance d₂′ 214 between z′ and the correct tag z 209 is smaller than the threshold t₂ 212, i.e. d₂′=d(f(m;k),z′)<t₂.

[0088] Hence, the steps of a method according to an embodiment of the invention may be summarised as follows:

[0089] Step 1: The transmitter and the receiver in a communications system share a secret value k 207. They may further agree on a distance function d and a threshold t₂ 212.

[0090] Step 2: For a message m 202 to be sent from the transmitter to the receiver a tag value z=f(m;k) 209 is calculated at the transmitter.

[0091] Step 3: The message m 202 and the tag z 209 are sent from the transmitter to the receiver via a communications channel.

[0092] Step 4: The receiver receives a message m′ 203 and a tag z′ 213.

[0093] Step 5: The receiver calculates z″=f(m′;k) 210.

[0094] Step 6: The receiver calculates the distance d₂=d(z′,z″) 215 and compares the calculated distance with the threshold t₂ 212.

[0095] Step 7: The receiver accepts the message m′ 203 if and only if t₂≥d₂.

[0096] Once a MAC function and a key are agreed upon, steps 2 through 7 may be repeated for a plurality of messages using the same function and the same key.
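The receiver side of steps 4 to 7 can be sketched in Python as follows. This is an illustrative sketch only: `toy_mac` is a hypothetical stand-in for f(m;k), implemented as a key-seeded bit permutation that is not cryptographically secure and that, unlike a preferred tag, is as long as the message. The names `verify`, `hamming` and the example values are assumptions introduced here and do not appear in the specification.

```python
import random

def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def toy_mac(message_bits, key):
    """Hypothetical stand-in for f(m;k): a key-seeded bit permutation (not secure)."""
    order = list(range(len(message_bits)))
    random.Random(key).shuffle(order)
    return [message_bits[i] for i in order]

def verify(received_bits, received_tag, key, t2, mac=toy_mac, dist=hamming):
    """Steps 5 to 7: recompute the tag, measure the distance, compare with t2."""
    recomputed = mac(received_bits, key)   # step 5: z'' = f(m';k)
    d2 = dist(received_tag, recomputed)    # step 6: d2 = d(z', z'')
    return d2 <= t2                        # step 7: accept iff t2 >= d2

key = 1234                                 # shared secret (step 1), example value
m = [1, 0, 1, 1, 0, 0, 1, 0]
z = toy_mac(m, key)                        # step 2 at the transmitter

m_noisy = list(m)
m_noisy[3] ^= 1                            # one bit error on the channel
m_forged = [b ^ 1 if i < 4 else b for i, b in enumerate(m)]  # four bits altered

print(verify(m_noisy, z, key, t2=2))       # accepted: distance 1 <= 2
print(verify(m_forged, z, key, t2=2))      # rejected: distance 4 > 2
```

Because the stand-in MAC is a pure permutation, the tag distance equals the number of altered message bits, which makes the accept and reject outcomes easy to follow.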

[0097] In case of acceptance, the message may be further processed at the receiver. In case of rejection of a message, a request for re-transmission may be sent from the receiver to the transmitter or other measures may be taken, such as informing a user, generating an event, sending a notification to the transmitter, or the like.

[0098] Now referring to FIGS. 3a-b, a preferred distance function on the messages is the Hamming distance. However, other distance functions may be used. The Hamming distance between two tuples is defined as the number of positions in which their components differ. For example, the Hamming distance between the binary tuples (0,0,1,1,1) and (1,1,0,0,1) is equal to four. In the following the Hamming distance between the message m and the message m′ will be denoted h(m,m′). More formally, h may be defined as $h(m,m') = \sum_{1 \leq i \leq L} F(m_i, m'_i),$

[0099] where m and m′ are bit sequences of length L, m_(i) is the i-th bit in m, m′_(i) is the i-th bit in m′, and the function F is defined as $F(x,y) = \begin{cases} 1, & \text{if } x \neq y \\ 0, & \text{otherwise} \end{cases}.$
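As a minimal illustration of the definition above, the Hamming distance h can be written in Python as below; the function name `hamming_distance` is chosen here for illustration and the example tuples are the ones given in the text.

```python
def hamming_distance(m, m_prime):
    """Sum of F(m_i, m'_i) over all positions, with F = 1 where the components differ."""
    if len(m) != len(m_prime):
        raise ValueError("sequences must have the same length L")
    return sum(1 for a, b in zip(m, m_prime) if a != b)

print(hamming_distance((0, 0, 1, 1, 1), (1, 1, 0, 0, 1)))  # prints 4
```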

[0100] In a transmission channel error model, the Hamming distance between the sent and received messages corresponds to the number of errors during transmission. In general, a MAC according to the invention should, preferably, have the property that a message is accepted if the distance between the sent and received message sequences is small. For example, if only a few errors occur during transmission, the receiver should still accept the message.

[0101] Now consider a message m sent by the transmitter and the message m′ received by the receiver. Preferably, a MAC method according to the invention is constructed such that all messages with h(m,m′)≤t₁ are always accepted, i.e., if not more than t₁ errors occur or somebody alters not more than t₁ bits during transmission, the message is accepted.

[0102] In the tag space Z, the Hamming distance may also be used as a distance function. Alternatively, different distance functions may be used. Now referring to FIG. 3a, a distance function δ(·,·) according to an embodiment of the invention will be described. According to this embodiment of the invention, an arbitrary tag value 301 z∈Z is divided into y different blocks 303-305, preferably of equal size L, such that z=z₁, z₂, z₃, . . . , z_(y). In the examples of FIGS. 3a-b y=3 and L=4. For a given threshold t₂, a distance function δ on elements in Z is defined as: ∀ z, z′∈Z, $\delta(z,z') = \sum_{i=1}^{y} g\left(h(z_i, z'_i)\right),$

[0103] where h is the Hamming distance and g is the function: $g(x) = \begin{cases} x & \text{if } x < t_2 \\ y \cdot t_2 & \text{if } x \geq t_2 \end{cases}.$

[0104] According to this embodiment, a received message-tag pair, m′,z′, is accepted if and only if δ(z′,z″)<y·t₂, where z″=f(m′;k). In the examples of FIGS. 3a-b the threshold is t₂=3.

[0105] Hence, a message is rejected if the Hamming distance between at least one of the pairs of blocks is equal to or larger than the predetermined threshold t₂. If, on the other hand, the Hamming distances between all pairs of blocks are smaller than t₂, the message is accepted. Compared to the Hamming distance, this distance function is less sensitive to bit errors which are randomly distributed over the tag value. This is illustrated by FIGS. 3a-b. In FIG. 3a two tag values z 301 and z′ 302 are shown with a Hamming distance of h(z,z′)=4. The bit positions 301 b, 301 f, 301 k, 301 l and 302 b, 302 f, 302 k, 302 l, respectively, where the two tags differ are spread over the entire sequence. Assuming a threshold t₂=3, the distance according to the distance function δ described above yields four, as the Hamming distance in none of the blocks reaches the threshold t₂. The blocks 303 and 308 differ by one bit, hence g(h(z₁,z₁′))=1. The blocks 304 and 307 differ by one bit, hence g(h(z₂,z₂′))=1. The blocks 305 and 306 differ by two bits, hence g(h(z₃,z₃′))=2. In FIG. 3b, the bit sequences z 301 and z′ 312 have the same Hamming distance 4 as in the example of FIG. 3a. However, now the bit positions 301 b, 301 c, 301 d, 301 f and 312 b, 312 c, 312 d, 312 f, where the tags differ are clustered in the beginning of the sequences and, correspondingly, the distance function δ yields a distance of 10: The blocks 303 and 318 differ by three bits, hence g(h(z₁,z₁′))=y·t₂=9, as h(z₁,z₁′)=3≥t₂. The blocks 304 and 317 differ by one bit, hence g(h(z₂,z₂′))=1. The blocks 305 and 316 do not differ, hence g(h(z₃,z₃′))=0.
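The block-wise distance δ and the two cases of FIGS. 3a-b may be reproduced with the following Python sketch. The concrete 12-bit tag value `z` is hypothetical, since the figures are not reproduced in the text; only the positions of the differing bits (b, f, k, l and b, c, d, f, taken here as zero-based indices 1, 5, 10, 11 and 1, 2, 3, 5) follow the description. The helper names `block_distance` and `flip` are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def block_distance(z, z_prime, y, t2):
    """delta(z, z') = sum over the y blocks of g(h(z_i, z'_i))."""
    if len(z) != len(z_prime) or len(z) % y != 0:
        raise ValueError("tags must have equal length divisible by y")
    size = len(z) // y
    total = 0
    for i in range(y):
        x = hamming(z[i * size:(i + 1) * size], z_prime[i * size:(i + 1) * size])
        total += x if x < t2 else y * t2   # g(x)
    return total

def flip(bits, positions):
    """Return a copy of bits with the given zero-based positions inverted."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(bits)]

y, t2 = 3, 3
z = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]   # hypothetical tag, positions a..l
z_spread = flip(z, {1, 5, 10, 11})         # FIG. 3a: errors at b, f, k, l
z_clustered = flip(z, {1, 2, 3, 5})        # FIG. 3b: errors at b, c, d, f

for z_prime in (z_spread, z_clustered):
    d = block_distance(z, z_prime, y, t2)
    print(d, "accept" if d < y * t2 else "reject")
# prints "4 accept" and then "10 reject"
```

Both received tags have the same plain Hamming distance of four to z, so the different outcomes come solely from how the errors are distributed over the blocks.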

[0106] It is an advantage of this distance function that it provides an efficient and secure authentication. In particular, in a construction where the different blocks of the tag value correspond to predetermined blocks of the respective messages, an authentication method is achieved which is more sensitive to localised errors than to randomly distributed bit errors.

[0107] It is understood that a corresponding construction may be defined for blocks of different length, e.g. in the case where the length of the message m is not an integer multiple of y. Alternatively, the message or one of the blocks may be padded, e.g. with zeros.

[0108] FIG. 4 shows a schematic view of a method according to a second embodiment of the invention. The method according to this embodiment utilises three insights described in the following:

[0109] Firstly, a MAC according to the invention should have the property that a small distance between messages will result in a small distance between the corresponding tags. A natural measure to use when considering channel errors is the Hamming distance. If the distance function according to the invention is based on the Hamming distance, the MAC algorithm should have the property that a few bit changes in a message will result in few bit changes in the corresponding tag. A mapping that has this property is a permutation. A permutation only changes the order of the bits in a sequence, but preserves the number of zeros and ones in the sequence. If two different binary sequences differ in, for example, n positions before a fixed permutation is applied to the sequences, the sequences will also differ in n positions after the permutation is applied.
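The distance-preserving property of a fixed permutation can be checked with a short Python sketch. The sequence length, the seed and the flipped positions are arbitrary example values, and `permute` is a helper name introduced here.

```python
import random

def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def permute(bits, order):
    """Apply a fixed permutation: output position j takes input bit order[j]."""
    return [bits[i] for i in order]

rng = random.Random(0)                      # fixed seed, example value only
n = 16
order = list(range(n))
rng.shuffle(order)                          # one fixed permutation P

m = [rng.randint(0, 1) for _ in range(n)]
m_prime = list(m)
for i in (2, 7, 11):                        # introduce three bit errors
    m_prime[i] ^= 1

before = hamming(m, m_prime)
after = hamming(permute(m, order), permute(m_prime, order))
print(before, after)                        # both values are 3
```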

[0110] Secondly, consider a message sequence m=m₁, m₂, . . . , m_(n) and apply a random permutation P to this sequence. We denote the output sequence by q=q₁, q₂, . . . , q_(n), i.e., q=P(m). Denote by In(m) the inverse of the sequence m, i.e., the sequence m with each bit flipped. Clearly, P(In(m))=In(P(m)). Hence, if only a permutation is used as the basis for a streaming MAC construction, it will be easy for an adversary to forge a message by simply flipping all bits in the message and in the corresponding message tag. This may be avoided by adding a fixed binary sequence to the message sequence before the permutation.
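The forgery described above, and the effect of adding fixed bits, may be illustrated as follows. This is a sketch under stated assumptions: the permutations, the message and the two appended zero bits are hypothetical values, and the permutation alone stands in for the tag computation.

```python
def permute(bits: str, perm) -> str:
    """Output position i takes input position perm[i] (1-based entries)."""
    return "".join(bits[p - 1] for p in perm)


def invert(bits: str) -> str:
    """In(m): the sequence with each bit flipped."""
    return "".join("1" if b == "0" else "0" for b in bits)


m = "1011"
perm = [3, 1, 4, 2]                    # hypothetical secret permutation
tag = permute(m, perm)                 # permutation-only "tag"

# The adversary flips all bits of message and tag without knowing perm.
forged_ok = permute(invert(m), perm) == invert(tag)
print(forged_ok)                       # True: the forgery verifies

# With a fixed sequence appended before the permutation, the trick fails,
# because the fixed bits are not flipped by the adversary.
fixed = "00"
perm6 = [5, 2, 6, 1, 4, 3]             # hypothetical permutation on 6 positions
tag2 = permute(m + fixed, perm6)
forged_ok2 = permute(invert(m) + fixed, perm6) == invert(tag2)
print(forged_ok2)                      # False
```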

[0111] Thirdly, applying a permutation to a bit sequence results in a new bit sequence with the same length as the original sequence. Preferably, a construction of an efficient MAC includes the use of a function that has an image that is much smaller than the function pre-image. Such a function is called a hash function and its image a hash value. A hash may be constructed by dividing the message into blocks of equal size, applying a function with the desired cryptographic properties to each block, and then summing all the different outputs, for example using an XOR function. The XOR function of a sequence of bits may be defined as follows: Consider two binary sequences of length L: b₁=b₁₁, b₁₂, . . . , b_(1L) and b₂=b₂₁, b₂₂, . . . , b_(2L). The XOR sum b₁⊕b₂ of the two sequences equals b₁₁⊕b₂₁, b₁₂⊕b₂₂, . . . , b_(1L)⊕b_(2L), i.e. the XOR is performed bitwise. Now, consider two binary message sequences of length 2L, m=m₁,m₂=m₁₁,m₁₂, . . . ,m_(1L),m₂₁,m₂₂, . . . ,m_(2L) and m′=m′₁,m′₂=m′₁₁,m′₁₂, . . . ,m′_(1L),m′₂₁,m′₂₂, . . . ,m′_(2L). Assume that the Hamming distance between the messages is h(m,m′)=k. Then, it follows that h(m₁⊕m₂, m′₁⊕m′₂)≦k, which is a desired property for a MAC according to the invention. The above relation follows from the fact that for binary values b₁, b′₁, b₂ and b′₂, the inequality b₁⊕b₂≠b′₁⊕b′₂ is valid if and only if either b₁≠b′₁ and b₂=b′₂, or b₁=b′₁ and b₂≠b′₂. Each bit position in which the XOR sums differ therefore accounts for at least one bit position in which the messages differ.
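The inequality h(m₁⊕m₂, m′₁⊕m′₂)≦h(m,m′) may be confirmed exhaustively for a small block length. The following Python sketch uses L=3, a value chosen only to keep the enumeration short; the helper names are hypothetical.

```python
from itertools import product


def hamming(x, y) -> int:
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(a != b for a, b in zip(x, y))


def xor(x, y):
    """Bitwise XOR sum of two equal-length bit tuples."""
    return tuple(a ^ b for a, b in zip(x, y))


L = 3                                   # small block length for the check
violations = 0
for m in product((0, 1), repeat=2 * L):
    for m_prime in product((0, 1), repeat=2 * L):
        k = hamming(m, m_prime)
        folded = hamming(xor(m[:L], m[L:]), xor(m_prime[:L], m_prime[L:]))
        if folded > k:
            violations += 1

print(violations)                       # 0: the XOR sum never increases the distance
```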

[0112] Now referring to FIG. 4, a streaming MAC method according to an embodiment of the invention comprises the following steps:

[0113] Step 1: A message m 401 is divided into message blocks 401 a-401 d, i.e. m=m₁,m₂, . . . ,m_(l). Each message block has a predetermined size. Preferably, all message blocks have the same size. In this case, if the message length is not a multiple of l, a fixed sequence may be appended so that the length of the new message sequence becomes a multiple of l.

[0114] Step 2: Modify the message block sequence by inserting some fixed bits at predetermined positions of each message block, for example by appending a predetermined bit sequence. In the example of FIG. 4, the same bit sequence 403, labelled 0 in FIG. 4, is added to all blocks, resulting in the new message blocks 402 a-402 d.

[0115] Step 3: Apply different permutations to each message block 402 a-d, resulting in the message blocks 404 a-d, and calculate the XOR sum 407 a of the outputs 404 a-d of the permutations.

[0116] Step 4: Repeat step 3 y times, preferably using different permutations, resulting in the message blocks 405 a-d through 406 a-d and the corresponding XOR sums 407 a-d. An optimal choice of y may depend on the message size and/or the error rate of the transmission channel.

[0117] Step 5: Concatenate all the different XOR sums 407 a-c into one hash value q 408.

[0118] The steps 1-5 may be repeated once or several times with the hash value 408 as input and with a new set of permutations, preferably with a smaller value of y, in order to generate a small hash sequence while providing a strong forgery protection.

[0119] Finally, the resulting hash value may be encrypted by taking, for example, the XOR sum with the output of a pseudo-random function. Then the hash value, possibly encrypted, is the streaming MAC tag of the message m.

[0120] In the following, still referring to FIG. 4, the steps of the MAC method according to an embodiment of the invention will be described in more detail.

[0121] In the following description, the following notation is used:

k: secret key value.
m: message to be authenticated.
z: authentication information or message tag.
t₂: a threshold design integer value.
q: intermediate hash value.
n: binary length of the message m.
L: block size of message blocks.
l: the number of message blocks, l=⌈n/L⌉.
y: the number of concatenated hashes.
r: the size of the sequences used for padding.

[0122] Let a and b be indices labelling the repetitions of step 3 above and the message blocks, respectively, i.e. 1≦a≦y and 1≦b≦l. Denote by k a secret key value. Let P_(k,(a,b))(x) be a permutation that takes as input a binary sequence x of length L and gives as output a permuted sequence of x. We assume that P_(k,(a,b)) is completely determined by the secret key k and the indices a,b. Preferably, given any (a,b), P_(k,(a,b)) is selected uniformly distributed over all possible permutations on the set {1,2, . . . ,L}. This is an advantage, because, given a series of R permutations,

P_(k,(a,b)₁), P_(k,(a,b)₂), . . . , P_(k,(a,b)_(R)),

[0123] for any (a,b)∉{(a,b)₁, (a,b)₂, . . . ,(a,b)_(R)}, it is computationally infeasible to extract information about P_(k,(a,b)) from the above series of permutations, if one does not have knowledge about the key k.
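The description does not prescribe how P_(k,(a,b)) is obtained from the key and the indices. Purely as a sketch of one conceivable approach, the following Python fragment seeds a shuffle with an HMAC-SHA256 value computed over the indices. The function name derive_permutation, the index encoding and the key are hypothetical, and Python's random.Random is not a cryptographically secure generator, so this fragment illustrates only the deterministic dependence on (k, a, b) and not the security property stated above.

```python
import hashlib
import hmac
import random


def derive_permutation(key: bytes, a: int, b: int, length: int) -> list:
    """Derive a permutation of 1..length from the key and the indices (a, b).

    Assumption: an HMAC-SHA256 digest seeds random.Random, which is NOT
    cryptographically secure; a real design would need a secure keyed shuffle.
    """
    seed = hmac.new(key, f"{a},{b}".encode(), hashlib.sha256).digest()
    perm = list(range(1, length + 1))
    random.Random(seed).shuffle(perm)
    return perm


key = b"hypothetical-shared-key"
p = derive_permutation(key, a=1, b=2, length=8)
print(sorted(p) == list(range(1, 9)))                      # True: a permutation
print(p == derive_permutation(key, a=1, b=2, length=8))    # True: deterministic
```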

[0124] Initially, let a=1.

[0125] Step 1: Let m be the message 401 to be protected, where m may be a message in a sequence of messages. Assume that m is represented as a binary sequence, m=m₁,m₂, . . . ,m_(n). Let e be the number of bits by which n falls short of the next integer multiple of L, so that e=0 if n is an integer multiple of L. Concatenate m and a binary all zero sequence 0=0,0, . . . ,0 of length e into a new message sequence m′=m,0 the length of which is an integer multiple of L, i.e. l·L=n+e. It is understood that instead of the binary all zero sequence 0 any other bit sequence of length e may be used. In the example of FIG. 4, e=0. Divide the message sequence m,0 into blocks 401 a-401 d of size L, i.e.,

m,0=m₁,m₂, . . . ,m_(L),m_(L+1),m_(L+2), . . . ,m_(2L),m_(2L+1), . . . ,m_(L·(l−1)), . . . ,m_(n),0,0, . . . ,0=m₁₁,m₁₂, . . . ,m_(1L),m₂₁,m₂₂, . . . ,m_(2L), . . . ,m_(l1),m_(l2), . . . ,m_(lL)=m′₁,m′₂, . . . ,m′_(l).
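Step 1 may be sketched in Python as follows. The function name pad_and_split is hypothetical, and e is taken here as the number of zero bits needed to reach the next integer multiple of L, which is what the relation l·L=n+e requires.

```python
def pad_and_split(m: str, L: int) -> list:
    """Zero-pad the bit string m to a multiple of L and cut it into blocks of size L."""
    e = (-len(m)) % L                  # number of padding bits, so that l * L = n + e
    padded = m + "0" * e
    return [padded[i:i + L] for i in range(0, len(padded), L)]


print(pad_and_split("101111110001", 2))   # ['10', '11', '11', '11', '00', '01']
print(pad_and_split("10111", 2))          # ['10', '11', '10'] (one zero appended)
```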

[0126] Alternatively, other methods of dividing the message m into blocks may be used. For example, bits from predetermined positions of m may be concatenated into a block.

[0127] Step 2: Now, let b_(i)=b_(i1),b_(i2), . . . ,b_(ir), i=1, . . . ,l be a set of l predetermined binary sequences of length r. Replace m with v=v₁,v₂, . . . ,v_(l)=m′₁,b₁,m′₂,b₂, . . . ,m′_(l),b_(l). The blocks v_(i)=m′_(i),b_(i) are illustrated as blocks 402 a-402 d in FIG. 4. For example, all b_(i) may be chosen to be a binary all zero sequence of length r. In a preferred embodiment, r may be chosen to be equal to t₂.

[0128] Step 3: Calculate

q_(a)=P_(k,(a,1))(v₁)⊕P_(k,(a,2))(v₂)⊕ . . . ⊕P_(k,(a,l))(v_(l))

[0129] Step 4: If a<y then let a=a+1 and go to Step 3, else continue with Step 5.

[0130] Step 5: Generate q=q₁,q₂, . . . ,q_(y), i.e. a concatenation 408 of the XOR sums 407 a-407 c calculated in step 3.

[0131] Step 6: Let PRF(k) be a binary sequence of a size equal to the size of q. Preferably, PRF is a cryptographically secure pseudo-random function. Let z=PRF(k)⊕q.

[0132] Step 7: The transmitter sends (m,z) over the channel. The receiver receives a pair (m′,z′). The receiver calculates the tag z″ of m′ according to Steps 1-6. If the distance d(z′,z″)<y·t₂, the message is accepted, otherwise it is rejected. Preferably, the distance function d is the distance function δ described in connection with FIGS. 3a-b. Alternatively, another distance function may be used, e.g. the Hamming distance or a distance function based on the Hamming distance. Furthermore, the threshold y·t₂ may be replaced by a different threshold.
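Steps 1-7 may be sketched end to end as follows. This is a minimal Python illustration under stated assumptions, not a definitive implementation: the parameters are hypothetical, a seeded random generator stands in for the key-derived permutations and for the PRF output, the fixed sequences are all zero, each permutation is read as "output position i takes input position perm[i]", and the plain Hamming distance is used as the distance function d.

```python
import random


def permute(bits, perm):
    """Output position i takes input position perm[i] (1-based entries)."""
    return "".join(bits[p - 1] for p in perm)


def xor(x, y):
    """Bitwise XOR sum of two equal-length bit strings."""
    return "".join("1" if a != b else "0" for a, b in zip(x, y))


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def flip(bits, positions):
    """Return bits with the given 0-based positions inverted."""
    out = list(bits)
    for p in positions:
        out[p] = "1" if out[p] == "0" else "0"
    return "".join(out)


def tag(m, perms, L, r, prf_out):
    """Steps 1-6: pad and split, append r zero bits, permute, XOR-sum, concatenate, mask."""
    m = m + "0" * ((-len(m)) % L)
    v = [m[i:i + L] + "0" * r for i in range(0, len(m), L)]
    q = ""
    for row in perms:                       # one row of l permutations per index a
        q_a = "0" * (L + r)
        for block, perm in zip(v, row):
            q_a = xor(q_a, permute(block, perm))
        q += q_a
    return xor(q, prf_out)


def accept(m_recv, z_recv, perms, L, r, prf_out, t2):
    """Step 7 with the Hamming distance as d and the threshold y * t2."""
    y = len(perms)
    return hamming(z_recv, tag(m_recv, perms, L, r, prf_out)) < y * t2


L, r, y, l, t2 = 4, 2, 3, 4, 2              # hypothetical parameters, r = t2
rng = random.Random(1)                      # stand-in for key-derived values


def random_perm(n):
    p = list(range(1, n + 1))
    rng.shuffle(p)
    return p


perms = [[random_perm(L + r) for _ in range(l)] for _ in range(y)]
prf_out = "".join(rng.choice("01") for _ in range(y * (L + r)))
m = "".join(rng.choice("01") for _ in range(l * L))
z = tag(m, perms, L, r, prf_out)

print(accept(m, z, perms, L, r, prf_out, t2))                     # True: intact
print(accept(flip(m, [5]), z, perms, L, r, prf_out, t2))          # True: one bit error
print(accept(m, flip(z, range(6)), perms, L, r, prf_out, t2))     # False: six tag bits flipped
```

In this sketch a single flipped message bit changes exactly one bit in each partial hash value q_(a), so the tag distance is y=3, which stays below the threshold y·t₂=6.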

[0133] If m is the i-th message in a sequence of messages, PRF may be chosen to be seeded by k and the index i, i.e. PRF(k,i).

[0134] It is understood that the PRF encryption may be replaced with a different encryption function.

[0135] The steps 1-5 may be repeated one or more times before continuing with step 6, each time using the calculated hash value q as an input m of the next iteration and, preferably, using a different set of permutations in each iteration. Preferably, y is decreased in subsequent iterations in order to generate a hash value of a small size. The hash value q of the last iteration is then used in step 6.

[0136] It is understood that, instead of calculating the XOR sum of the permutations in step 3, the partial hash values q_(a) may be calculated according to different rules. For example, q_(a) may be calculated recursively as:

q_(a)=P_(k,(a,l))( . . . (P_(k,(a,2))(v₁)⊕v₂)⊕ . . . ⊕v_(l)).

[0137] Furthermore, the order of the blocks v may be changed.

[0138] In the following, the method according to the embodiment described in connection with FIG. 4 will be further illustrated by an example:

[0139] Consider a message represented as a binary sequence m=1011 1111 0001, and consider the following parameters:

t₁=1, r=t₂=2, n=12, L=2, l=6, y=2

[0140] With these parameters y·l=12 different permutations on the set with r+L=4 elements are needed. The permutations may be chosen at random or according to a different selection rule. Furthermore, the permutations may be part of the key. In this example the permutations are assumed to be as follows:

P_(K,(1,1))={1,2,4,3}, P_(K,(1,2))={4,1,3,2}, P_(K,(1,3))={1,4,2,3}, P_(K,(1,4))={1,3,4,2}, P_(K,(1,5))={1,2,3,4}, P_(K,(1,6))={1,3,4,2}.

P_(K,(2,1))={4,1,2,3}, P_(K,(2,2))={3,1,4,2}, P_(K,(2,3))={3,4,2,1}, P_(K,(2,4))={2,4,1,3}, P_(K,(2,5))={1,4,2,3}, P_(K,(2,6))={3,1,4,2}.

[0141] Furthermore, assume that m is the first message in a sequence and that PRF(K,1)=1101 0011.

[0142] Below it is shown how to calculate z:

a=1.

[0143] Step 1: The length of m is a multiple of L=2 and, therefore, it is not necessary to pad any new bits to it (e=0).

[0144] Step 2: v=1000 1100 1100 1100 0000 0100.

[0145] Step 3:

q₁=P_(K,(1,1))(1000)⊕P_(K,(1,2))(1100)⊕P_(K,(1,3))(1100)⊕P_(K,(1,4))(1100)⊕P_(K,(1,5))(0000)⊕P_(K,(1,6))(0100)=1000⊕0101⊕1010⊕1001⊕0000⊕0001=1111.

[0146] Step 4: a=2. Go to Step 3.

[0147] Step 3:

q₂=P_(K,(2,1))(1000)⊕P_(K,(2,2))(1100)⊕P_(K,(2,3))(1100)⊕P_(K,(2,4))(1100)⊕P_(K,(2,5))(0000)⊕P_(K,(2,6))(0100)=0100⊕0101⊕0011⊕1010⊕0000⊕0001=1001.

[0148] Step 4: a=2. Continue with Step 5.

[0149] Step 5: q=1111 1001.

[0150] Step 6: z=PRF(K,1)⊕q=1101 0011⊕1111 1001=0010 1010.

[0151] Step 7: Send (m,z)=(1011 1111 0001, 0010 1010).
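The numbers of this example may be recomputed with the following Python sketch. It assumes that a permutation written as {p₁,p₂,p₃,p₄} is read as "output bit i is input bit p_(i)", which is the reading under which the intermediate values of Step 3 are obtained; the helper names are hypothetical.

```python
def permute(bits, perm):
    """Output bit i is input bit perm[i] (1-based entries)."""
    return "".join(bits[p - 1] for p in perm)


def xor(x, y):
    """Bitwise XOR sum of two equal-length bit strings."""
    return "".join("1" if a != b else "0" for a, b in zip(x, y))


m = "101111110001"
L, r = 2, 2
perms = [
    [[1, 2, 4, 3], [4, 1, 3, 2], [1, 4, 2, 3], [1, 3, 4, 2], [1, 2, 3, 4], [1, 3, 4, 2]],
    [[4, 1, 2, 3], [3, 1, 4, 2], [3, 4, 2, 1], [2, 4, 1, 3], [1, 4, 2, 3], [3, 1, 4, 2]],
]
prf = "11010011"                         # PRF(K,1) as given in the example

# Steps 1-2: split into blocks of size L and append r zero bits to each block.
v = [m[i:i + L] + "0" * r for i in range(0, len(m), L)]

# Steps 3-5: XOR sum of the permuted blocks for each a, then concatenation.
q = ""
for row in perms:
    q_a = "0" * (L + r)
    for block, perm in zip(v, row):
        q_a = xor(q_a, permute(block, perm))
    q += q_a

# Step 6: mask the hash value with the PRF output.
z = xor(q, prf)

print(v)   # ['1000', '1100', '1100', '1100', '0000', '0100']
print(q)   # 11111001
print(z)   # 00101010
```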

[0152] It is noted that it is possible to show that an adversary who tries to replace m with m′, where h(m,m′)>1, succeeds with a probability that is less than (⅝)².

[0153] FIG. 5 shows a block diagram of a communications system according to an embodiment of the invention. The system comprises a transmitter 501 and a receiver 506 which communicate via a transmission channel 505. The transmitter comprises a processing unit 503 which is connected to a storage medium 502 and a transmitter unit 504. The receiver 506 comprises a processing unit 508 which is connected to a storage medium 507 and a receiver unit 509.

[0154] On the storage medium 502 of the transmitter 501, computer-executable code is stored which, when loaded in the processing unit 503, is adapted to implement a MAC algorithm according to the invention. Furthermore, relevant parameters for the MAC algorithm, such as a secret key, threshold values, block lengths, etc., are also stored on the storage medium 502. The message payload to be transmitted may also be stored on the storage medium 502. The communication between the transmitter 501 and the receiver 506 may, for example, be implemented as a layered protocol stack, e.g. according to the OSI model. At the transmitter, one of the layers of the protocol stack may include an implementation of the MAC algorithm according to the invention, where the MAC algorithm receives a message payload from a higher layer, and the resulting tag value is combined with the message payload and sent to a lower layer of the layered protocol stack which initiates the transmission of the message via the transmitter unit 504. At the receiver, the message is received by the receiver unit 509 and processed by the lowest layers of the protocol stack at the receiver. The received message is routed to the processing unit 508. On the storage medium 507, computer-executable code is stored which is adapted to implement the corresponding MAC algorithm at the receiver 506 when loaded in the processing unit 508. Furthermore, the corresponding parameters for the MAC algorithm are also stored on the storage medium 507. The message payload and the received tag value are forwarded to the MAC program module executed on the processing unit 508. Based on the comparison of tag values, the received message may either be passed to a higher layer of the protocol stack or a re-transmission of the message may be initiated. Furthermore, the message may be stored on the storage medium 507.

[0155] The storage media may include magnetic tape, optical disc, digital video disk (DVD), compact disc (CD or CD-ROM), mini-disc, hard disk, floppy disk, ferro-electric memory, electrically erasable programmable read only memory (EEPROM), flash memory, EPROM, read only memory (ROM), static random access memory (SRAM), dynamic random access memory (DRAM), ferromagnetic memory, optical storage, charge coupled devices, smart cards, etc.

[0156] The processing units may include a microprocessor, an application-specific integrated circuit, or another integrated circuit, a smart card, or the like.

[0157] FIG. 6a shows a first example of a message format according to an embodiment of the invention. The message comprises a header 601 comprising information such as routing information, information about the length of the following message, information about whether authentication is to be applied, sender identification, etc. The message further comprises a message body 602 comprising the information to be transmitted and the tag value 603 calculated according to a MAC function according to the invention.

[0158] FIG. 6b shows a second example of a message format according to an embodiment of the invention. According to this example, the message, comprising a header 601, a message body 602 and a tag value 603, is divided into smaller data packets 604 a-e. The division into the packets 604 a-e may, for example, be performed by a lower layer of a protocol stack at the transmitter, and each of the packets 604 a-e may include header information according to the communications protocol used. At the receiver, the message 602 and the tag value 603 are reconstructed from the received smaller packets 604 a-e before the authentication check is performed.

[0159] FIG. 6c shows a third example of a message format according to an embodiment of the invention. According to this example, the message body 602 is divided into smaller messages 605 b, 606 b, and 607 b, and respective tag values 605 c, 606 c, and 607 c are calculated for each of the smaller messages 605 b, 606 b, and 607 b. Subsequently, the messages 605-607, each message comprising respective header information 605 a, 606 a, 607 a, respectively, message bodies 605 b, 606 b, 607 b, respectively, and tag values 605 c, 606 c, 607 c, respectively, are sent to the receiver. At the receiver, the authentication check is performed for each of the messages 605-607 prior to the reconstruction of the message 602 from the message bodies 605 b, 606 b, and 607 b.
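The format of FIG. 6c may be sketched as follows. The Packet structure, the function names and the placeholder tag function toy_tag are hypothetical; in particular, toy_tag is an unkeyed XOR fold used only to make the sketch runnable and is not a MAC according to the invention. The check uses the Hamming distance with a hypothetical threshold of 2.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Packet:
    header: str    # here simply a sequence number
    body: str      # bit string
    tag: str       # bit string


def hamming(x: str, y: str) -> int:
    return sum(a != b for a, b in zip(x, y))


def toy_tag(body: str) -> str:
    """Placeholder tag: XOR fold of 4-bit chunks (unkeyed, illustration only)."""
    acc = 0
    for i in range(0, len(body), 4):
        acc ^= int(body[i:i + 4].ljust(4, "0"), 2)
    return format(acc, "04b")


def split_and_tag(body: str, size: int, tag_fn: Callable[[str], str]) -> List[Packet]:
    """Divide the body into smaller messages, each carrying its own tag."""
    parts = [body[o:o + size] for o in range(0, len(body), size)]
    return [Packet(str(i), part, tag_fn(part)) for i, part in enumerate(parts)]


def reassemble(packets: List[Packet], tag_fn: Callable[[str], str],
               threshold: int) -> Optional[str]:
    """Check every packet before reconstruction; return None if any check fails."""
    for p in packets:
        if hamming(p.tag, tag_fn(p.body)) >= threshold:
            return None
    return "".join(p.body for p in packets)


body = "101111110001"
packets = split_and_tag(body, 4, toy_tag)
print(reassemble(packets, toy_tag, threshold=2))    # 101111110001

packets[0].body = "1010"                            # one bit error in a body
print(reassemble(packets, toy_tag, threshold=2))    # 101011110001 (tolerated)

packets[1].tag = "0000"                             # heavily corrupted tag
print(reassemble(packets, toy_tag, threshold=2))    # None
```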

[0160] It is understood that other message formats may also be used within the scope of the invention. For example, the tag value may be combined with the message body in a different way, e.g. by prepending the tag value or placing it at predetermined positions within the message body. Furthermore, depending on the transmission protocol used, other ways of splitting up the message and/or including header information may be used, including the use of no header information or the sending of the tag value and the message body separately.

1. A method of authenticating data, the method comprising the steps of receiving a message (114, 203) and a corresponding first data item (116, 213) generated according to a first predetermined rule (102, 206); generating a second data item (115, 210) according to a second predetermined rule (112, 206) on the basis of the received message; calculating a first distance (117, 215) between the received first data item and the generated second data item; comparing the calculated first distance with a predetermined distance value (212).
2. A method according to claim 1, characterised in that the method further comprises the step of processing the message conditioned on a result of the step of comparing the calculated first distance with a predetermined distance value.
3. A method according to claim 1 or 2, characterised in that the step of processing the message comprises the step of accepting the message if the calculated first distance is smaller than the predetermined distance value.
4. A method according to any one of claims 1 through 3, characterised in that the message and the first data item are received via a wireless communications channel.
5. A method according to any one of claims 1 through 4, characterised in that the step of generating the second data item comprises the step of applying a predetermined permutation to a third data item derived from the message.
6. A method according to any one of claims 1 through 5, characterised in that the step of generating the second data item comprises the step of combining a fourth data item derived from the message with a predetermined fifth data item.
7. A method according to claim 6, characterised in that the step of combining a fourth data item derived from the message with a predetermined fifth data item comprises the step of inserting predetermined binary sequences at predetermined positions of the fourth data item.
8. A method according to any one of claims 1 through 7, characterised in that the step of calculating a first distance comprises the step of calculating a Hamming distance.
9. A method according to any one of claims 1 through 8, characterised in that the method further comprises the steps of a) generating a first sequence of message sections from the message, each message section having a predetermined length; b) modifying at least a first message section of the sequence of message sections; c) applying at least a first permutation to at least the modified first message section of the sequence of message sections; d) calculating at least one XOR sum of a result of at least the first permutation; e) calculating a hash value from the calculated at least one XOR sum.
10. A method according to claim 9, characterised in that the method further comprises the step of repeating the steps of applying at least the first permutation and calculating an XOR sum a predetermined number of times.
11. A method according to claim 9 or 10, characterised in that the method further comprises the step of repeating the steps a) through e) using at least a second permutation.
12. A method according to any one of claims 9 through 11, characterised in that the method further comprises the step of encrypting the calculated hash value.
13. A method according to any one of claims 1 through 12, characterised in that the step of calculating the first distance further comprises the steps of dividing the first and second data items into corresponding first and second sets of data sections; calculating a second distance between a first section of the first set of sections and a corresponding second section of the second set of sections; and comparing the second distance with a predetermined threshold value.
14. A method according to any one of claims 1 through 13, characterised in that the first data item is further generated on the basis of a first secret key code; and the step of generating the second data item on the basis of the received message further comprises the step of generating the second data item on the basis of a second secret key code.
15. A method according to any one of claims 1 through 14, characterised in that the first and second secret key codes are the same key code; and the first and second predetermined rules are the same rule.
16. A method of transmitting a message from a transmitter (101, 501) to a receiver (110, 506) via a transmission channel (108, 505), the method comprising the steps of at the transmitter generating a first data item (105, 209) according to a first predetermined rule (102, 206) on the basis of the message (104, 202); transmitting the message and the generated first data item from the transmitter to the receiver; generating a second data item (115, 210) according to a second predetermined rule (112, 206) on the basis of the received message (114, 203); calculating a first distance (215) between the received first data item (116, 213) and the generated second data item; comparing the calculated first distance with a predetermined distance value (212).
17. A method according to claim 16, characterised in that the generated first data item has a size which is smaller than a size of the message.
18. A method according to claim 16 or 17, characterised in that the step of generating the first data item comprises the step of calculating a hash function on the basis of a sixth data item derived from the message.
19. A communications system comprising first processing means (503) adapted to calculate a first data item according to a first predetermined rule on the basis of a message; a transmitter (504) adapted to transmit the message and the generated first data item via a transmission channel (505); a receiver (509) adapted to receive the transmitted message and the transmitted first data item; second processing means (508) adapted to generate a second data item according to a second predetermined rule on the basis of the received message; to calculate a first distance between the received first data item and the generated second data item; and to compare the calculated first distance with a predetermined distance value.
20. An apparatus comprising a receiver (509) adapted to receive a message and a corresponding first data item generated according to a first predetermined rule; first processing means (508) adapted to generate a second data item according to a second predetermined rule on the basis of the received message; to calculate a first distance between the received first data item and the generated second data item; and to compare the calculated first distance with a predetermined distance value.
21. An apparatus according to claim 20, characterised in that the apparatus is a mobile radio terminal.
22. A data signal embodied in a carrier wave for use in a method according to any one of the claims 1 through 15, the data signal comprising a message and a first data item.
23. A computer program comprising program code means for performing all the steps of any one of the claims 1 through 15 when said program is run on a microprocessor.
24. A computer program product comprising program code means stored on a computer readable medium for performing the method of any one of the claims 1 through 15 when said computer program product is run on a microprocessor.